perm filename LISP1.OLD[LSP,JRA]19 blob sn#192401 filedate 1975-12-16 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00018 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00003 00002	.Sec(Symbolic expressions,Symbolic expressions)
C00018 00003	.SS(Symbolic Expressions: abstract data structures)
C00025 00004	.<<Examples: s-exprs not sexprs>>
C00030 00005	.SS(Trees: representations of Symbolic expressions)
C00034 00006	.GROUP
C00038 00007	.REQUIRE "PROB1.PUB" SOURCE_FILE
C00039 00008	.SS(Primitive Functions)
C00051 00009	.<<car and cdr>>
C00058 00010	.<<discussion of components of function def>>
C00069 00011	.SS(Predicates and Conditional Expressions,conditional expression)
C00077 00012	%3
C00093 00013	Here's an intuitive description of such a predicate named %3equal%*.
C00099 00014	.REQUIRE "PROB2.PUB" SOURCE_FILE
C00106 00015	.SS(Sequences: abstract data structures,,P100:)
C00125 00016	.SS(Lists: representations of sequences,lists)
C00141 00017	.BEGIN TURN ON "←"
C00154 00018	.SS(A respite)
C00172 ENDMK
C⊗;
.Sec(Symbolic expressions,Symbolic expressions)
.SS(Introduction)

This book is a study of data structures and programming languages.
In particular it is the study of data structures and programming
languages centered around the language LISP. However this is not 
a manual to help you become a proficient LISP coder.
We will study many of the formal and theoretical aspects of languages  and
data structures as well as examining the practical applications of
data structures. We will show that this area of computer science
is a discipline of importance and beauty, and worthy of careful study.
How are we to proceed? How do we introduce rigor into a field
whose countenance is as %3ad hoc%* and diverse as that of programming?
We must also bear in mind that the results of our studies are to have
practical applications.
We must not pursue theory and rigor without proper
regard for practice. Our study is not that of pure mathematics; our
results will have applications in everyday programming practice.
However, for guidance let's look at mathematics. Here is a well-established discipline
rich in history and full of  results of both practical and theoretical
importance. Will our comparison of mathematics and programming languages
bear fruit or will it simply show our impudence and naivety? We shall see.

One of the more fertile, yet easily introduced areas of mathematics, is that of
elementary number theory. It is easy to introduce because
everyone knows something  about the natural numbers.
Number theory studies properties of a certain class of operations definable
over the set %dN%* of non-negative integers.
A very formal presentation might begin with a construction of %dN%*
from more primitive notions, but it is usually assumed that the reader
is familiar with the fundamental properties of %dN%*.
In either case the next step would be to define the
class of operations which we would allow on our domain.

We shall begin our study of LISP in a similar manner, 
as an investigation of a certain
class of operations definable over  a domain of objects, called Symbolic Expressions.
Though most people
know something about the integers, we are perhaps not so fortunate when it
comes to Symbolic expressions. 
We must define what we mean by "symbolic expression". 
Let's look to mathematics again
for help. If we asked someone to define the domain %dN%*, the definition
we would receive 
would depend on how familar that individual was with the properties of the
integers.

For most people and purposes,  the following  characterization of an integer
is satisfactory:

.BEGIN TABIT1(10);

%aI:%* \An integer is a sequence of decimal digits.
.END

However this description is mathematically quite superficial and is completely
inadequate for the purposes of discussing properties of the integers.
Clearly all of the
information we know about the relationships between the integers is lacking
in this description. The "meaning" of the integers is missing. It is
like giving a person an alphabet and rules for forming syntactically
correct words but not supplying  a dictionary which relates these words to
the person's vocabulary.

If pressed for details we might attempt a more adequate
characterization like the following:

.BEGIN GROUP;TABIT1(15);

\%21.%* %3zero%* is an element of %dN%*.

%aII:%*\%22.%* If %3n%* is in %dN%* then  the %3successor%* of %3n%* is in %dN%*.

\%23.%* The only elements of %dN%* are those created by %21%* and %22%*.
.END

Definition %aII%* appears to be completely at the other end of the spectrum;
it tells us very little about the appearance of  the integers. It gives us a
initial element %3zero%* and a mysterious operation  called %3successor%*,
which is supposed to exhibit a new element, given an old one.
And unless we are careful about the meaning of %3successor%*, definition %aII%*
will be  inadequate. For example if we define the
successor of an integer to be that same integer then %aII%* is satisfied
but unsatisfactory.

.BEGIN TURN ON "#";
How should we describe  %3successor%* so that our intentions are captured?
A sufficient way is to give a  definition of %3successor%* as a mapping
or function from integers to integers.
To give such an explicit definition requires some notation: let %30%*  be a
notation for %3zero%*; then
we define a function %2S%* such that the successor of
%3zero%*  --#called %3one%*#-- is denoted by %2S%3(0)%*, etc.
.END


The characterization  of decimal digits  given in %aI%*
is syntactic.
The notation itself tells us nothing about the interrelationships between the
integers, but it does give us a  notation for representing them.
Thus %32%* can be used to represent  %3two%* or 
used as an abbreviation for %2S%3(%2S%3(0))%1.
One benefit of the %2S%*-notation is that it explicitly shows the
means of construction. That is, it shows more of the properties of the
integers than just distinguishability.
We shall refer to the digit representation as %2numerals%* and reserve the
term, integer, for the abstract object.
Thus numerals denote, stand for, or represent the abstract objects called
integers. 

But notation and syntax are necessary and we must be able to give precise
descriptions of syntactic notions.  
Given a choice between the two previous definitions, %aI%* and %aII%*, 
it appears that %aII%* is more precise.
Much less is left to the imagination; given
%3zero%* and a definition of %3successor%* the definition will act as a
recipe for producing elements of %dN%*. This style of definition is called an
⊗→inductive definition↔←. The basic content of an inductive definition
of a set of objects consists of three parts: 

.BEGIN GROUP;
.P117:
.BEGIN INDENT 10,10;
(1) A description of an initial set of objects; the elements of this
set are to be included automatically in the set we are describing in the
inductive definition.
.END
%aIND%*
.BEGIN INDENT 10,10;
(2) Given the description of some existing elements
in the set, we are given a means of constructing more elements.  

(3) An extremal clause, saying that the only elements in the set are those
which gained admittance by either (1) or (2).
.END
.END

Clearly our definition of %dN%* in terms of %3zero%* and %3successor%*
is an instance of %aIND%*: we are defining the set of integers;
%3zero%* is initially included in the set; then applying the second phrase
of the definition we can say that %3one%* is in the set since %3one%* is the
successor of  %3zero%*.

We can recast the positional notation description as an inductive definition.

.BEGIN GROUP;TABIT1(15);

\%21.%* A numeral is a digit

\%22.%* if %3n%* is a numeral then %3n%* followed by a digit is a numeral.

\%23.%* The only numerals are those created by %21%* and %22%*.
.END

In words, "a numeral is a digit or a numeral followed by a digit".

In this application of %aIND%*, the initial set has more
than one element; namely the  ten decimal digits.
Also we assume that the questioner knows what "digit" means.
This is a characteristic of all definitions:
we must stop %2somewhere%* in our explication. 
Notice too that we assume  that "followed by" is juxtaposition.


Inductive definitions  have been the province of mathematics for
many years; however computer science has encroached a bit and has
developed a style of syntax specification 
called BNF (Backus-Naur Form) equations which has the same intent as
that of inductive definitions.  Here is the previous inductive definition
of "numeral" as a set of BNF equations:

.BEGIN TABIT1(15);

<numeral>\::= <digit>
<numeral>\::= <numeral><digit>

As an abbreviation, the  two BNF equations may also be written:

<numeral>\::= <digit>|<numeral><digit>,

.END
A comparison between the BNF and the inductive descriptions of
numeral  should clarify much of the notation, but to be complete
we give a thorough analysis.
The symbol "::=" is to be read "is a",
the symbol "|" is to be read "or";
and the sequence "><" is to be read "followed by".
The strings beginning with "<" and ending with ">" stem from the
references to "numeral" and "digit" in %21%1 and %22%*;
by convention, components of BNF equations which %2describe%* elements
are enclosed in "<" and ">"; and elements of the set which are given %2explicitly%*
are written without the "< >" fence.
Thus "<digit>" is not a numeral but is a description; to make the definition
of <numeral> complete we should include an equation like:

.BEGIN TABIT1(15);

<digit>\:: = %30 |1 |2 ... |9%*

.END

What should be remembered from the discussion in this section?
First we wanted and needed precise ways of describing the elements of our study 
on data structures. We have seen that inductive definitions are a powerful
way of describing sets of objects. We have seen a variant of inductive definitions
called Backus-Naur Form equations. We will use BNF equations to describe the
syntax of our data structures and our language.

Second, we  have  begun to  see the  difference  between an  abstract
object  and  its  representation.    This  distinction has  been  well
studied in philosophy and mathematics, and we will see that this idea
has strong consequences  for the field of  programming  and computer
science  in general.  Representation of  abstract objects will play  a
crucial role in this text. 
.SS(Symbolic Expressions: abstract data structures)

We now wish to apply the techniques and ideas which we have developed in
the previous section. We wish to show that  careful definitions and
use of abstraction will benefit the study of data structures and LISP.
To begin our study we should therefore characterize the domain
of LISP data structures in a manner similar to what we did for numbers.

Our objects are called  %2Symbolic Expressions%*.
Our domain of  Symbolic Expressions is named %d<sexpr>%*.
⊗→Symbolic expressions↔← are also known as  %6⊗→S-expressions↔←%*
or %6⊗→S-exprs↔←%*.  

.BEGIN TURN ON "#";
The set of symbolic expressions is defined inductively over a base set
named %d<atom>%*. 
The set %d<atom>%* can itself be defined inductively. We give the BNF equations
for elements of %d<atom>%* below, but the essential character of the
domain is  that it contains two kinds of objects: the %2⊗→literal atoms↔←%*
and the signed numerals.
The elements of %d<atom>%* are called %2atoms%*.
.END

.GROUP SKIP 2;
.BEGIN TABIT1(15);
.GROUP;
.P115:
<atom>\:: = <literal atom>|<numeral>| -<numeral>
<literal atom>\:: = <atom letter>|<literal atom><atom letter>|<literal atom><digit>
<numeral>\:: = <digit>|<numeral><digit>
<atom letter>\:: =%3 A |B |C ...| Z%*
<digit>\:: = %30 |1 |2 ... |9%*
.APART;
.END;
Thus a literal atom is a string of uppercase letters and digits, subject
to the provision that the %2first%* character in the atom be a letter.


.GROUP SKIP 2;
.BEGIN TABIT2(20,35);GROUP;
For example:\atoms\not atoms
%3\ABC123\2a
\12\a
\A4D6\$$g
\NIL\ABD.
\T\(A . B)
%*
.group skip 1;
.END;

The characteristics of atoms which most interest us are their distinguishability:
the atom %3ABC%* is distinguishable from the atom %3AB%*. That the string 
%3AB%* is a
part of the string %3ABC%* is not germane to our current discussion⊗↓However, we
will discuss such topics in {yonss(P27)} on string processing.←.
Similarly we will seldom need to exploit numerical relationships underlying
the numerals. At best we will use simple counting properties.
Thus most of our discussions will deal with non-numeric atoms.
Most implementations
of LISP do however contain a large arithmetic entourage, including 
floating point numbers and operations. Many implementations also
give a wider class of literal atoms, allowing some special characters to appear;
for most of our discussion the above class is quite sufficient.

The domain of Symbolic expressions, called %d<sexpr>%* is defined inductively over
the domain %d<atom>%*⊗↓We will not give the extremal clause, 
but it is assumed to hold.←.

.BEGIN INDENT 15,15;
%21.%* Any element of %d<atom>%* is an element of %d<sexpr>%*.

%22.%* If %9α%41%1 and %9α%42%1 are elements of %d<sexpr>%*, 
then the %2⊗→dotted-pair↔←%* %3(%9α%41%3.%9α%42%3)%1 is in %d<sexpr>%*.

.END
Thus %d<sexpr>%* includes %d<atom>%* as a proper subset. 
The notation we chose for the dotted-pairs is the following:
.BEGIN INDENT 0,10;
A dotted-pair consists of a left-parenthesis followed by a
S-expr, followed by a period, followed by an S-expr,
followed by a right-parenthesis.
.END

In the definition of %d<sexpr>%* we have introduced %9α%4i%1 as 
%2⊗→match-variable↔←s%*.
Greek letters %9α%* and %9β%* will be used throughout the text in several contexts
to designate pattern matches. For example, if we let %9α%41%1 be
%312%* and let %9α%42%1 be %3ABC%* then %3(%9α%41%3.%9α%42%3)%1 is
%3(12#.#ABC)%* or if we let %3(A#.(B#.#C))%* be %3(%9α%*#.#%9β%3)%1 then
%9α%1 is %3A%* and %9β%* is %3(B#.#C)%1.

Finally here's a BNF description of the full set of S-expressions.
.BEGIN TURN ON "←";
.P47:

←<sexpr> :: = <atom>|%3(%*<sexpr> . <sexpr>%3)

.END;

Notice that if we allow floating point numbers  as atoms some care needs to be
exercised when writing S-expressions. How should %3(A.1.2)%* be interpreted?
Is it the dotted pair %3(A . 1.2)%*, or is it just an ill-formed expression?
Evaluation of such ambiguous constructs will depend on the implementation; such
details do not interest us yet.

.<<Examples: s-exprs not sexprs>>
.BEGIN TABIT2(20,40);
.GROUP;

Examples:\S-exprs\not S-exprs
%3
\A\A . B
\(A . B)\(A . B . C)
\(((A.B) .C) . (A.B))\((A . B)))
%*
.APART;
.END;
Recall  our caveat on numerals and numbers. It also applies here. When we
described  the domain, %d<sexpr>%*, 
 we picked a %2specific%* syntactic representation for its elements.
It will be a
convenient notation since it makes explicit  the construction of the composite
S-expr from
its components⊗↓Just as the "successor" notation shows  the  construction
of  the numbers from %30%*. This kind of  notation will  be much more useful
in LISP, since our interest in data structures will  focus on the 
construction process and the interrelationships between components of a
S-expr.←,
and the  notation is also consistent with LISP history.

However there is more to the domain, %d<sexpr>%*, than  syntax;
just as there is more to %dN%* than positional notation⊗↓%32%*,
II in Roman numerals, 10 in binary, ... are all 
representations of the number two.←.
What %2are%* the essential features of S-expressions? 
Symbolic expressions are either
atomic or they have two components. 
If we  are confronted with a non-atomic S-expression
then we want a means of distinguishing between the
"first" and the "second" component. 
The "dot notation" does this for us, but 
obviously "(", ")", and "."  of the dotted-pairs are simply
notation or syntax.  We could have just as well represented
the  dotted-pair of %3A%* and %3B%*
as the set-theoretic ordered pair, %3<A,B>%*, or
any other notation which preserves the essentials of the domain %d<sexpr>%*.

.BEGIN TURN ON "#";
The distinctions between abstract objects and
their representation 
are quite important. As we continue our study of more and more complex
data structures the  use of an abstract data structure instead of one
of its representations can mean the difference between a clear and clean program
and a confusing and complicated program. 
There are similar gains for us when we study algorithms defined over these
abstract data structures. The less the algorithm knows about the representation
of the data structure, the easier it will be to modify or understand that
algorithm. Indeed you  may have
already experienced this phenomenon if you have programmed.
A program written in a high-level language is almost always more understandable
than its machine-language counterpart --#the benefits of a high-level language
are primarily notational rather than computational. The high-level program
is more abstract whereas the machine-language program knows a great deal about 
representations.
Finally, if you still doubt that representations
make a difference in clarity, try doing long division in Roman numerals.
We will  say much more about abstraction and representation in algorithms and
data structures as we proceed.
.END
.SS(Trees: representations of Symbolic expressions)
Besides the more conventional typographical notations, S-expressions
also have interesting %2graphical%*  representations.
S-exprs have a natural
interpretation as a  structure which we call a LISP-tree or L-tree.
.GROUP SKIP 1;
.GROUP;
Here are some L-trees:

.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#";
.TURN ON "∞" FOR "%";
             /\				    /\ 
          L /  \ R                         /  \R
           /    \			 L/   /\
        L /\R  L/\R                      /  L/  \R
         /  \  /  \			∞3A∞*   ∞3B∞*   ∞3NIL∞*
        ∞31∞*   ∞32∞*  ∞3A∞* L/\R
                 ∞3D∞*  ∞3E∞* 
.END

.APART;
.select 1;
We can give an inductive definition:
.BEGIN INDENT 15,15;
%21.%* Any labelled terminal node is a L-tree. A labelled terminal node
is a symbol, %9.%*, with an element of %d<atom>%* as a subscript.
For example %9.%4ABC%1.

%22.%* If %3n%41%1 and %3n%42%1 are L-trees then attaching the tails
of the arrows, %9L%* and %9R%* to %9.%* and the heads of the arrows to
%3n%41%1 and %3n%42%1 also forms an L-tree.
.END
Most important:  there are no intersecting branches.  We will talk about more
general structures later (called list-structures or directed graphs).

You can see how to interpret S-exprs as L-trees.  The atoms are
interpreted as terminal nodes;  and since non-atomic S-exprs always
have two sub-expressions we can write the first
sub-expression as the left branch of an L-tree and the second sub-
expression as the right branch.  
Typically we, leave off the %4L%* (left) and %4R%* (right) subscripts since it is clear
from context which they are.
For example:

.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "∞" FOR "%";
.TURN ON "?" FOR "\";
.TABS 30,55;
     ∞3(A . B)∞*      ?∞3(A . (B . C))∞*    ?∞3((A . B) . C)∞*

       /\	      ?	    /\                ?     /\
      /  \ 	      ?	   /  \               ?    /  \
     ∞3A∞*   ∞3B∞*    ?	  /   /\              ?   /\   \ 
		      ?	 /   /  \             ?  /  \   \ 
		      ?	∞3A∞*   ∞3B∞*   ∞3C∞* ?	∞3A∞*   ∞3B∞*   ∞3C∞*
.END

.GROUP
.SELECT 1;
.P102:
Other representations of LISP-trees which will be more suggestive when
we talk about implementation of LISP are:
%3

.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
		        ⊂αααπααα⊃            ⊂αααπααα⊃
		  ααααα→~ # ~ #αβααααααααααα→~ # ~ # ~
		        %αβα∀ααα$            %αβα∀αβα$
		          ↓                    ↓   ↓
		          ∞3A∞*                    ∞3B∞*   ∞3C∞*

.END
.APART
.GROUP
%1or equivalently:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
		        ⊂αααπααα⊃            ⊂αααπααα⊃
		  ααααα→~ A ~ #αβααααααααααα→~ B ~ C ~
		        %ααα∀ααα$            %ααα∀ααα$
.END


.APART
.SELECT 1;
These last  representations are called %2⊗→box-notation↔←%*.

Again please keep in mind the distinction between the abstract object
referred to as a symbolic expression  and the several representations
which we have shown.

The question of representation is so important and will occur so frequently that
we introduce notation for a representational mapping, %9r%*.
To represent domain %dD%1 in domain %dE%1, we will define a
function %9r%4D→E%1 which usually will be inductively given but in any event
will express the desired mapping.

For example a representational  mapping %9r%4<sexpr>→L-tree%1 can be given:
.BEGIN CENTERIT;
←%9r%f(%1<atom>%f)%1 = %9.%4<atom>%1
←and for %9α%* and %9β%* in %d<sexpr>%*:
.END
.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "?" FOR "\"
.TURN ON "∞" FOR "%";
.TABS 25,37;
?∞9r∞f(∞3(∞9α∞* . ∞9β∞*)∞f)∞1 =∞G?     /\
??    /  \
??   /    \
??∞9r∞f(∞9α∞f)∞g   ∞9r∞f(∞9β∞f)∞g
.END

Typically context will determine the appropriate subscript on the %9r%1-mapping;
thus we will omit it.
.REQUIRE "PROB1.PUB" SOURCE_FILE;
.SS(Primitive Functions)
%1
So far we have described the domain of abstract objects called Symbolic
Expressions and have exhibited several representations for these objects.
We will now describe some functions or operations to be performed on this domain.
We need to be a bit careful here. We are about to see one of the main
differences between mathematics and computer science: mathematics
empasizes the idea of function; computer science emphasizes the idea of
algorithm, process, or procedure.

.BEGIN TURN OFF "{,}";TURN ON "#","←";
What is a function? Mathematically a function is simply a
mapping such that for any given argument in the domain of the function there
exists a unique corresponding value. 
In elementary set theory, a definition of function %3f%* involves saying that %3f%*
is a set of ordered pairs %3f#=#{#<x%41%*,#y%41%*>,#...}%1; the %3x%4i%1's are
all distinct and the value of the function %3f%* for an argument %3x%4i%1 is
defined to be the corresponding %3y%4i%1.
With either definition no rule of computation is given to help
locate values; with the first definition it is implicit that the internal structure
of the mapping doesn't matter; in the set-theoretic definition, the correspondence
is explicitly given.

An algorithm or procedure is a process for
computing values for a function. The factorial function, %3n!%*,
can be computed by many different algorithms; but thought of as a function it is
a set 

←%3{<0,1>, <1,1>, <2,2>, <3,6>, ...<n,n!>,  ...}%1
.END
For the initial rash of primitives, the distinction between function and algorithm
will not be too apparent. However we will soon remedy that oversight.

.BEGIN TURN ON "#";
Now for some terminology:
The domain of a function is the set of all values for which the function is 
defined; the range of a function is the set of all values which the function
takes on.
A careful definition of a function, %3f%*, requires specification of a 
further set; for lack of a better name, call it the domain of discourse. 
The domain of discourse, named %aD%* for lack of imagination, 
consists of all possible values which may occur
as the argument to a function.
If the domain of a particular function %3f%* coincides with %aD%* then %3f%*
is said to be a %2⊗→total function↔←%* (over %aD%*); if there are 
elements of %aD%* which are not in the domain of %3f%* then %3f%* is a 
⊗→partial function↔←, and %3f%* is said to be undefined for those values.
A substantive decision needs to be made on how to handle partial functions.
Since we are attempting to be reasonably realistic about our modelling of
computation we should be as precise as possible in our formalism.
We could introduce a class of error values and include them in the range of
%3f%*; these values would be given as the result of applying %3f%*
to an argument not in its domain; or
we could simply say that the result is "unspecified"
⊗↓Indeed, how "undefined" manifests itself on  a machine will depend on the implementation.
Sometimes error messages are given; sometimes it results in an excursion into
the subconscious.←. 
We shall pick an intermediate position; 
we shall introduce %2one%* new element into the discussion. This element,
%9B%*, called "unspecified" or "undefined", or "bottom"⊗↓"bottom" is
sometimes written %9W%*.←, will represent the
value of a partial function applied to an element not in its domain.
We will simultaneously extend all our functions so that they are now defined
over this augmented domain; 
thus constructs like %3f(a)#=#%9B%1 and %3f(%9B%*)#=#a%1 are allowed.

.BEGIN TURN OFF "}","{";
Remember, however, that the concept of "total or partial function" is relative
to a specified domain of discourse. 
A function which is partial over some set %aD%* 
is undefined for some arguments.
But that function can be extended to
a total function over the augmented domain %aD%1∪{%9B%1} by 
including %9B%* in the range of the function and assigning
%9B%* as the value for %3f%* where %3f%* is undefined on %aD%*, and
assigning a value in %aD%*∪{%9B%1} for %3f(%9B%*)%1.
As we define new  data structures we will frequently want to extend
our functions to larger domains. For our purposes, 
a function %3f%* defined on %aD%*∪{%9B%1} can be extended to 
include a larger domain, 
%aD%41%1, by defining %3f(d%41%*)#=#f(%9B%*)%1 for %3d%41%9ε%aD%41%1.
.END
A final piece of information: many of the functions which we will examine
are defined such that %3f(...,%9B%*,#...)#=#%9B%1; that is %3f%* returns
the undefined value whenever any of its arguments are undefined.
Functions which possess this property are called %2⊗→strict function↔←%*s.
We will see ({yon(P112)}) that a viable theory of computation must involve %2non%*-strict
functions as well as strict functions.
.END

Finally, to apply this discussion of %9B%* to LISP we will define an
extended domain %aS%* to be:
.BEGIN CENTER;TURN OFF "}","{";

%AS%* =  %d<sexpr>%*#∪#{%9B%1}.
.END

Then we can talk about functions which are total over %aS%* or %d<sexpr>%*,
and we will talk about functions which are partial over %d<sexpr>%*.
Typically when we ask if a function of symbolic expressions is partial or 
total without specifying
a domain, we are asking the question over the natural, unextended domain, %d<sexpr>%*.

All this business about "undefined" may seem a bit much; 
however part of the business
of computer science is to give a clear understanding of computation. The
care we take now will soon pay dividends when we study non-termination of
computations and when we scrutinize the differences between "function" and
"algorithm".


The first LISP function we consider is the
⊗→%3cons%*↔← function which is used to generate S-exprs from less
complicated S-exprs.  %3cons%*  called a %2constructor%*-function and
is a  strict, binary function; it is a total function
over the domain %d<sexpr>%*. Thus %3cons%* expects two arguments and 
 gives %9B%* as value whenever either of its arguments is
%9B%*.
Whenever %3cons%* is presented with two elements %9α%* and %9β%* of %d<sexpr>%* 
%3cons[%9α%*;%9β%*]%1 returns  a new S-expr %3(%9α%*#.#%9β%*)%1. That is,
interpreted as a LISP-tree, %3cons[%9α%*;%9β%*]%1 has a left branch %9α%*
and has a right branch of %9β%*.  For example:
 
.BEGIN CENTERIT;SELECT 3;

←cons[A; B] = (A . B)
←cons[(A . B); C] = ((A . B) .C)

.END;
Expressions like the above, which can  be evaluated, are called forms.
Constants are therefore forms: the value of a constant is that constant.
Function applications are forms: perform the designated function on the
designated arguments.

.P127:
Notice that we are designating function application in LISP by 
"function name, followed by a list of arguments delimited by `[' and
`]'⊗↓The syntax equations for forms are given
on {yon(P66)}.←."  The `[...]'-notation is part of the LISP syntax and we will reserve
`(...)'-notation for the function application of mathematics. 
In a few places in our discussions the distinction will be important.
Typically the distinctions will occur when we wish to distinguish between
the LISP algorithm and the mathematical function computed by that algorithm.
.<<car and cdr>>

We have two strict, unary selector functions, %3car%* and %3cdr%*⊗↓These strange
names are hold-overs from the original implementation of LISP on an ancient
IBM 704. That machine had partial-word instructions to reference the
%da%1ddress and %dd%1ecrement parts of a machine location. The %3a%* of
%3car%* comes from "address", the %3d%* of %3cdr%* comes from "decrement".
The %3c%* and %3r%* come form "contents of" and "register".
Thus %3car%* could be read "contents of address register".←, 
for traversing LISP-trees.
We already know the meaning of "strict"; a unary function expects %2one%* argument;
and a selector function is a data structure manipulating function which
will select a component of a composite data structure.
%3car%* and %3cdr%* are selectors since they will select components of
non-atomic elements of %d<sexpr>%*.
Thus %3car%* and %3cdr%* are both ⊗→partial functions↔← over %d<sexpr>%*;
they give values in %d<sexpr>%* only for non-atomic arguments;
they give %9B%* whenever they are presented with an atomic argument.

When given a non-atomic argument, %3(%9α%* . %9β%*)%1,
⊗→%3car%*↔← returns as value the first subexpression, %9α%1;
⊗→%3cdr%*↔← 
(pronounced could-er) returns as value the second sub-expression %9β%1.
For example:
.GROUP SKIP 1;
.BEGIN CENTERIT;
.GROUP;
←%3car[(A . B)] = A        car[A] = %9B%*.
←%3cdr[(A . B)] = B       cdr[(A .(B . C))] = (B . C)
←car[((A . B) . C)] = (A . B)
%*
.APART;
.END;
.GROUP SKIP 1;
.GROUP;
As with most mathematical theories, we will allow ⊗→functional composition↔←.
The composition of two unary functions %3f%* and %3g%* is another
function, sometimes denoted by %3f⊗g%*. The value of an expression, %3f⊗g[x]%*,
is the value of %3f[g[x]]%*.
That is, the value of %3f⊗g[x]%* is a %3z%* 
such that %3y%* is the value of %3g[x]%*
and %3z%* is the value of %3f[y]%*. %3f⊗g%* may be undefined for several reasons:
%3g[x]%* may be undefined, or %3f[y]%* may be undefined.
Here are some examples of composition:

.GROUP SKIP 1;
.BEGIN CENTERIT;
%3
←car⊗cdr[(A .(B . C))] = car[cdr[(A .(B . C))]] = car[(B . C)] =  B 
←cdr⊗cdr[(A .(C . B))] = cdr[cdr[(A .(C . B))]] = cdr[(C . B)] =  B
←cdr[cdr[A]] = %9B%*
←car[cdr[(A . B)]] = %9B%*
←car[cons[x;A]] = x     cdr[cons[Y;y]] = y .
%*
.END;
.APART

Notice  that the compositions which give result %9B%* do so by resorting
to our characterization of %3car%* and %3cdr%* as strict functions which
have been extended to %aS%*.

Notice too that in the last two examples we have introduced variables (over
S-exprs).  In the sequel lower-case 
identifiers⊗↓See {yon(P66)} for the BNF equations for <indentifier>.← will be
used freely as  variables.  So for example %3Y%* is an atom; %3y%* is a variable.
Be clear on the distinction between LISP variables like %3x, y%* or %3foo%*,
and the match variables⊗↓%1also called meta-variables← like %9α%* or %9β%*.
For %9α%* and %9β%* ranging over S-expressions, %3(%9α%*#.#%9β%*)%1 is a
well formed S-expr; %3(x#.#y)%* is %2not%* well-formed even though %3x%*
and %3y%* may evalute to S-expressions.

The composition of many %3car%* and %3cdr%* functions occurs so frequently that an
abbreviation has been developed.  
Given such a composition, we select  in left-to-right order, 
the relevant %3a%*s and %3d%*s in the %3car%*s and %3cdr%*s. We sandwich this
string of %3a%*s and %3d%*s between a left-hand %3c%* and a right-hand %3r%*
and give the composition this name.

.BEGIN CENTERIT;GROUP;
For example:
%3
←cadr[x] <= car[cdr[x]]
←caddr[x] <= car[cdr[cdr[x]]]
←cdar[x] <= cdr[car[x]]
%*
.END;
These compositions are also called ⊗→%3car-cdr-%2chains%1↔←, and are useful in
traversing LISP-trees. The "<="- notation is to be read "is defined to be the 
function ...".
This notation is only a temporary convenience and not part of LISP.
Very soon we will study what is involved in giving and using definitions in LISP
({yonss(P49)}).
For the moment intuition should suffice.

.<<discussion of components of function def>>
It %2is%* useful now, however, to introduce some terminology for talking
about the components of a function definitions. Let
.BEGIN CENTERIT; SELECT 3;
←f[x%41%*; ...; x%4n%*] <= %9x%3[x%41%3; ... ;x%4n%*]
.END
represent a typical definition. The %2name%* of the function is %3f%*;
the %2body%* of %3f%* is the expression on the right-hand side:
%9x%3[x%41%3; ... ;x%4n%*]%1. The list of variables appearing after the name
%3f%* are called the %2⊗→formal parameters↔←%*. The list of variables appearing
after the %9x%1 are to mean that those variables appear in  the expression.

A function is %2applied%*  using the common
notation of %2⊗→function application↔←%1:
.BEGIN CENTERIT; SELECT 3;
←f[a%41%*; ...; a%4n%*]
.END
The %3a%4i%1's are called %2⊗→actual parameters↔←%1; they must agree in
number with the formal parameters of the definition of %3f%* and they
are to be associated in a one-for-one order, %3x%4i%1 with %3a%4i%1.
Thus in the expression %3car[cdr[(A#.#B)]]%* the actual parameter
to the %3car%* function is %3cdr[(A#.#B)]%*, and the actual parameter to
%3cdr%* is %3(A#.#B)%*.
How the actual parameters are to be treated will be a large part of our 
study.

.BEGIN TURN ON "#";
Once again we are getting a hint of differences between mathematics
and computer science.
Mathematically speaking, a composition of functions is simply another function
--#i.e.,#a mapping#-- and therefore nothing need be said about how to compute
composed functions. From a computational point of view, 
we want to  express evaluation
of expressions involving composed functions in terms of the evaluation of
subexpressions. This would allow us to describe a complex computation in terms
of an appropriate sequence of subsidiary computations.
One of the more natural ways to evaluate expressions involving compositions is
to evaluate the inner-most expressions first, then work outwards. 
Assume arguments to multi-argument
functions are evaluated in left-to-right order. Thus:
.END

.BEGIN SELECT 3;GROUP;TABIT2(10,45);

\cons[car[(A . B)];cdr[(A . (1 . 2))]]\=  cons[A;cdr[(A . (1 . 2))]]  
\\=  cons[A;(1 . 2)] 
\\=  (A .(1 . 2))
.END

.P106:
Evaluation may seem to be a simple operation but
looks can be deceiving; evaluation is a very complex process.
The value of an expression may depend on the %2order%* in which we do things. 
For example consider
the evaluation of %3foo[t%41%*;t%42%*]%1 where %3foo[x;y]#<=#y%1.
We may run into difficulties if we insist on 
using the following scheme:
.BEGIN TABIT1(7);
.P121:

%aCBV%*\Evaluate the arguments to a function before calling the function.
.END

This scheme, %aCBV%*, is what we were informally using to evaluate 
the previous examples. However, in %2this%* example 
the computation of the value of %3t%41%1 may involve an undefined expression,
like taking %3car%* of an atom for example.
If we expect %3foo%* to be a strict function, then %3foo[t%41%*;t%42%*]%1
is expected to return %9B%*
even though it is reasonable to believe that since
the definition of %3foo%* does not depend on %3x%* then the value
of %3foo%* should %2not%* depend on the computation of %3t%41%1.

Or consider the seemingly innocous example:

.BEGIN CENTERIT;SELECT 3;
←car[cons[x;y]] =x.
%1It is a simple variant to:%*
←car[cons[x;A]] =x
.END
.BEGIN TURN ON "#";
The second equation holds for all values of %3x%*. 
It is an identity. Even if %3x%* is undefined,
it holds.
We are not so fortunate in the first example. For
if the computation of the argument %3y%* yields %9B%* then the left-hand
side of the equation will yield %9B%* since %3car%* and %3cdr%* are strict
functions.
The  right-hand side of the equation will obviously give the value of %3x%*:
defined if %3x%* is defined; %9B%* otherwise. In neither case will the
value depend on the computation of %3y%*.
We can overcome the previous irritations by resorting to a different
scheme for evaluating expressions:
.BEGIN TABIT1(7);GROUP;

.P126:
%ACBN%*\Substitute the unevaluated arguments into the body of the function.
.END
Using %aCBN%* then,
%3foo[car[A];t%42%*]#%1is%*#t%42%1 and will be independent of 
whatever happened to the
first argument.
We will meet more computational difficulties later.
On {Yon(P112)} we will see another computational catastrophe, non-terminating
computations, which will cause problems for evaluation schemes.
What should be kept in mind from this discussion is that we must be
careful when discussing the process of evaluation; the function we are
characterizing by  computing its values will often depend on our choice
of evaluation scheme. In {yonsec(P113)} we will discuss evaluation techniques
and will give a precise characterization of the evaluation of LISP expressions.
.END

Before introducing a further class of LISP expressions we summarize
the
syntax of the LISP expressions (or forms) allowed so far:

.BEGIN TABIT1(13);GROUP;
.P66:

<form>\::= <constant> | <function>[<arg>; ...;<arg>] |  <variable>

<constant>\::= <sexpr>  

<function>\::= <identifier>

<arg>\::= <form>

<variable>\::= <identifier>

<identifier>\::= <letter> | <identifier><letter> | <identifier><digit>

<letter>\::= %3a | b | ... | z  

.END
.P130:
The ellipses in the first equation are not part of BNF syntax. It is
an abbreviation meaning "zero or more occurrences".
Thus the equation means a <form> is a <function> followed by the symbol "["
followed by zero or more <arg>'s followed by the symbol "]".
This use of ellipses can always be replaced by a sequence of BNF equations. 

To increase lucidity
we will frequently violate these syntax equations, allowing function names
containing special characters, e.g. %3fact*%*, %3fib%9'%1 or + ; or writing %3x +y%*
instead of %3+[x;y]%*.  No attempt will be made to characterize these violations;
occurrences of them should be clear from context.

.GROUP;
Notice that the class <⊗→form↔←> is a collection of LISP expressions which can
be evaluated. A <form> is either:

.BEGIN INDENT 5,5;
%21.%* a constant: the value is that constant

%22.%* a function name followed by zero or more arguments: we've said a bit about 
evaluation schemes for these constructs.

%23.%* a variable: a variable in LISP will typically have an associated value. 
.END
.APART
Again we will wait to {yonss(P49)} for a precise description.

.P111:
An important constraint on LISP forms which is not covered by the syntax
equations is the requirement that functions are defined as being n-ary for
some %2fixed%* n. n-ary functions
must have %2exactly%* n arguments presented to them whenever they are applied.
.P95:
Thus %3cons[A], cons[A;B;C], %*and %3car[A;B]%* are all ill-formed expressions
and are therefore undefined.

.CENT(Problems)

I.  Discuss %3cons[car[x];cdr[x]] = x%1.
II. Discuss %3cons[car[%9α%*];cdr[%9α%*]] = %9α%1.
.SS(Predicates and Conditional Expressions,conditional expression)
.BEGIN TURN ON "#";
We cannot generate a very exciting theory based simply on %3car, cdr,%* and %3cons
%*
with functional composition. Before we can write reasonably interesting 
algorithms we must have some way of performing conditional actions.
To do this we first need ⊗→predicates↔←. A LISP predicate is a function
returning a value representing truth or falsity.  We will represent the
concepts of true and false by %et%* and %ef%* respectively. Since these truth
values are distinct from elements of %aS%*, we will set up a new domain %aTr%*
which will consist of the elements, %et%*, %ef%*, and %9B%4Tr%1. 
The extra element %9B%4Tr%1 is
included so that we may talk about partial predicates just as we 
talked about partial functions on %d<sexpr>%*. Thus to be
ultra-precise we should always subscript %9B%* but it is usually clear from
context which "undefined" element  we are talking about⊗↓A word for the 
inveterate LISP hacker: our use of %et%* and %ef%* marks the first major
break from LISP folklore. 
The typical LISP trick is to map true to the atom %3T%* and false 
to the atom %3NIL%*. 
Our heresy will disallow some mixed compositions of LISP
functions and predicates.  We shall not apologize for that
and by the end of this chapter you should see
why we  have deviated. A final word towards reconciliation: 
by the time we are ready to run on a machine
our heresy will have been purified.←.
.END

LISP has two primitive predicates.
The first is a strict unary predicate named ⊗→%3atom%*↔←;
%3atom%* is total over %d<sexpr>%*, and
is a special kind of predicate called a ⊗→recognizer↔← or a discriminator.
Recognizers are used to determine the type of an instance of a data structure.
Thus %3atom%* will return %et%* if the argument is an atom, and will return %ef%*
if the argument is a non-atomic s-expression.
.GROUP SKIP 1;
.BEGIN CENTERIT;
.GROUP
%3
←atom[A] = atom[NIL] = %et%*
←atom[(A . B)]  = %ef%*
←atom[car[(A . B)]]  = %et%*
←atom[%9B%4S%3] = %9B%4Tr%3
←atom[atom[A]] %1 is undefined since %et%1 is not an element of %aS%*.
%*
.APART
.END

The second primitive predicate is named ⊗→%3eq%*↔←.  
It is a strict binary predicate, partial over the set %d<sexpr>%*;
it will give a defined value only if  its arguments are both atomic.  
It returns %et%* if the arguments evaluate to the
same atom; it returns %ef%* otherwise.
.BEGIN CENTERIT;
%3
.GROUP SKIP 1;
.GROUP
←eq[A;A] = %et%*		eq[A;B] = %ef%*
←eq[(A . B); A] = %9B%4Tr%3       %3eq[(A . B);(A . B)] = %9B%4Tr%3
←eq[eq[A;B];D] = %9B%4Tr%3     eq[%9B%4S%3;x] = %9B%4Tr%3
←eq[car[(A . B)];car[cdr[(A .(B . C))]]] = %ef%1
.APART;
.END
For completeness sake we should define a version of %3eq%*, say %3eq%4Tr%1, which
is defined over %aTr%* and acts like %3eq%*. For expediency's sake we will
simply extend the definition of %3eq%* so that it may compare two elements of
%aTr%* as well as compare two elements of %aS%*.

.BEGIN CENTERIT;
%3
.GROUP
←eq[%et%*;%et%*] = %et%*		eq[%ef%*;%9B%4Tr%3] = %9B%4Tr%3
←eq[%ef%*;%ef%*] = %et%*		eq[%et%*;%ef%*] = %ef%*
←eq[A;%et%*] %1 is undefined
.APART;
.END

Notice that we now have %2two%* separate domains: S-expressions and truth values.
Since we will be writing functions over several domains we will need a 
general recognizer for each domain to assure that the operations defined on each
abstract data structure are properly applied. Thus we introduce  the recognizer
%3issexpr%*. %3issexpr%* will give %et%* on the domain of S-exprs, and will
give %ef%* everywhere else.

.BEGIN CENTERIT;SELECT 3;GROUP;
←issexpr[(A . B)] = issexpr[A] = %et%*
←issexpr[%et%*] = %ef%*  issexpr[%9B%*] = %9B%*
.END

We need to include a construct in our
language to effect  a test-and-branch operation.
The IF-THEN-ELSE operation in LISP is called the conditional expression.  It
is written:
.BEGIN CENTERIT;
%3
←[p%41%* → e%41%*; p%42%* → e%42%*; ... ; p%4n%*  → e%4n%*]
%1
.END

.P40:
Each %3p%4i%1 is a predicate and therefore takes values in the
set %aTr%*; each %3e%4i%1 is an expression which will either give a value in
%aS%* or %aTr%*.

The meaning (or semantics) of conditionals is:  
.BEGIN INDENT 10,10;
We evaluate the %3p%4i%1's from left to right, finding the %2first%*
which returns value %et%*.  When we find such a %3p%4i%1, 
we evaluate the corresponding
%3e%4i%1.  
The value of the conditional expression is the value computed by that %3e%4i%1; 
if all
of the %3p%4i%1's evaluate to %ef%* then the conditional expression is 
undefined⊗↓Not
%9B%4Tr%1, not %9B%4S%1, but undefined since it is not clear %2which%* 
%9B%* is desired.←.  The
conditional expression  also gives %9B%4Tr%1 if we come across a %3p%4i%1
which  has value %9B%4Tr%1 before we hit a %3p%4i%1 with value %et%1.
And finally the value of the conditional is undefined if we come
to a %3p%4i%1 which is undefined as we scan from left to right.
.END
.BEGIN CENTERIT;
.GROUP;
Examples:
%3
←[atom [A] → B; eq [A;(A . B)] → C] = B
←[eq [A;(A . B)] → C; atom [A] → B] = %9B%4Tr%3
←[atom [(A . B)] → B; eq [A ; B] → C; eq [car[(A . B)]; cdr[(B . A)]] → E] = E
%1
.APART
.END
Notice that the p%42%* expression of the first example is undefined, but
the conditional gives value %3B%* since p%41%* gives value %et%*.


.BEGIN TURN ON "#";
Frequently it is convenient to use a special form of the conditional expression
where the final %3p%4n%1 is guaranteed to be true. 
You can think of lots of predicates which
are always true (for example, %3eq[1;1]%* ).  A natural predicate is the
constant predicate, %et%*; the value of %et%* is always %et%*.  
Thus the special form:
.END
.GROUP

.BEGIN CENTERIT;

%3
←[%3p%41%*  → %3e%41%*; ... %et%* → %3e%4n%*]
%1
.END
.BEGIN TURN ON "#";
Now if we know that the previous
%3p%4i%1's are either true or false, 
the final %3p%4n%1#→#%3e%4n%1-case is a catch-all
or otherwise-case which will be executed if none of the previous %3p%4i%1's
give %et%*.
Thus the use of %et%* in this context can be read "otherwise";
and the conditional can be read:
.END

.BEGIN CENTERIT;

←"If %3p%41%1 is true then %3e%41%1, else if %3p%42%1 is true then ..., otherwise %3e%4n%1."
.END

.APART;
.BEGIN TURN ON "#";
.P17:
The introduction of conditional expressions has further widened the gap between
traditional mathematical theories and computational theories. 
Previously we could almost side-step the issue of order of evaluation; 
it didn't really matter unless %9B%* got into the act. But now
the very definition of meaning of conditionals involves an order of evaluation. 
We should  ask just
how important %2is%* the order of evaluation in conditional expressions.

First the order of evaluation is important from a computational
viewpoint on the grounds of efficiency: if we are going to give as value
the leftmost %3e%4i%1 whose %3p%4i%1 evaluates to %et%*, then there is
no need to compute any of the other %3e%4j%1's; those values will never be used.
A more pressing difficulty is that of partial functions. If we did not
impose an order of evaluation on the components of a conditional then
frequently we would attempt to evaluate expressions which would lead
to undefined results: %3[eq[0;0]#→#%et%*;#%et%*#→#1/0]%1 gives %et%*
using the meaning of conditionals whereas the expression would be undefined 
if we were forced to evaluate %31/0%*.
.END

From the theoretical view things are not so bleak; we can 
handle the problem of undefined. If we continue to allow
%9B%* as an argument or value to a function, then we can characterize
the effect of a conditional expression as a %2⊗→non-strict↔←%* function.
A non-strict function is allowed to return a value other than %9B%* even
when one of its arguments is %9B%*.

Let %3cond(x;y;z)%1 be the function ⊗↓Notice we are 
writing `(...)' rather than `[...]' since we are talking
about the function and not the algorithm. See {yon(P127)}.← 
computed by: %3[x#→#y;#%et%*#→#z]%1 
and assume  we insist on evaluating all of the
constituents of  %3cond%* %2before%* we see which %3p%4i%1 has value %et%*.
then %3cond(0=0;#%et%*;#1/0)%1 would give value %9B%* if %3cond%* were strict rather
than giving value %et%*.
The situation can be saved if we were to define %3cond%* as a non-strict function
such that:

.BEGIN TABIT1(15);GROUP;
\%3y%* if %3x%* is %et%*
%3cond(x;y;z) = \z%1 if %3x%* is %ef%*
\%9B%* if %3x%* is %9B%*
.END

.P112:
We have now introduced  most of the machinery of LISP. We have a
good idea of the primitive data structures which LISP operates on and a
reasonable repertoire of functions and predicates. We have seen
some of the distinctions between functions and algorithms and now
are ready to introduce yet another difficulty: non-termination of computations.

Consider the following algorithm:

.BEGIN CENTERIT; SELECT 3;
←f[x] <= [x=0 → 1; %et%*  → f[x-1]]
.END
This algorithm defines a function giving %31%* for any non-negative integer and
is undefined for any other number. From a computational point, however,
%3f[-1]%* is "undefined" in a different sense than %3car[A]%* is
"undefined". For a partial function like %3car%* we can conceive
of giving an error message whenever we attempted to apply the function
to a spurious argument. But it stretches one's credulity to expect
to include tests like "if the  computation %3f[a]%* does not terminate then
give error No. 15."⊗↓Indeed, there are good reasons to be sceptical about
the existence of such omniscient tests.←. Again, from the purely
functional point of view, %3f%*  still defines the partial function
which is %31%* for the non-negative integers, regardless of how badly
it botches things up for negative numbers. Thus we will identify
computations which do not terminate with our element %9B%*;
and so a computation may be "undefined" for two reasons:
it involves a non-terminating computation; it involves applying a 
partial function to a value not in its domain.
We will discuss non-termination again in more detail in {yonsec(P113)}
where we will examine  schemes for evaluation.
A final point concerning the necessity of non-strict functions in a theory
of computation:
Consider a typical recursive definition  of a function, %3f%*.  
We claim that deep in  its black heart %3f%*
must  utilize   a non-strict function.
That is, a function which can determine  its value without complete information
concerning the values  of all  of its arguments  --a "don't care  condition"--.   
For otherwise the little beast won't terminate;  
it will continue to attempt to evaluate all arguments, ad nauseam. 

The conditional function is such  a non-strict function.
That is  %3cond(%et%*;q;r)%1  has value %3q%*  without
knowing  anything  about  what  happens  to  %3r%*.  
In particular,
%3cond(%et%*;q;%9B%*)#=#q%1.  
Likewise %3cond(%ef%*;%9B%*;r)#=#r%1.
Now since %3cond%* is to be a function and therefore single-valued,
if %3cond(%et%*;q;%9B%*)#=#q%1   then for any argument %3x%*,
%3cond(%et%*;q;x)#=#q%1.  
Thus %3cond(%et%*;x;y)%1 and  %3cond(%ef%*;y;x)%1 act like functions of the
form %3f(x;y)%*:

.BEGIN CENTERIT SELECT 3;
←f(x;%9B%*) = z %1 implies for %2every%1 %3y%* in the domain %3f(x;y)=z%1.
.END

Notice that %9B%* is now carrying an additional "don't-care" interpretation, 
certainly consistent 
with its previous assignments when we think of the function being computed
by the algorithm.

.BEGIN TURN ON "#";
So we %2could%* interpret conditional expressions as non-strict functions
and preserve the intended meaning of the construct; however such an
interpretation does a disservice to the computational needs of a practical
language. Pragmatics dictates that we evaluate no more expressions than
are necessary; previous examples (%3foo%*,#{yon(P106)}) show that we may 
have already violated
this dictum in special cases. To compound the felony by certifying the 
previous discussion as the meaning of conditionals would be most
unfortunate. The original computational description (or operational meaning)
of conditional expressions stands.
.END

.P131:
What benefits have resulted from our study of %9B%*?
We should have a clearer understanding of the difference between
function and algorithm and a better grasp of the kinds of difficulties
which can befall  an unwary computation.
But notice that
even after introducing %9B%* we have continued to call some computations
undefined. The character of these miscreants is that they occur in the
context of supplying the wrong %2kind%* of argument to a function. This
kind of error is called a %2⊗→type-fault↔←%*, meaning that we expected
an argument of a specific type and since it was not forthcoming, we
refuse to  perform any kind of calculation. 
Thus %3atom[%ef%*]%1 and %3cons[%et%*;A]%1 are undefined
since both expect elements of %aS%* as arguments. See {yon(P132)} for further
discussion of type-faults.

Now that the points have been made we will relax into a more
informal discussion of total vs. partial and function vs. algorithm.
For example when we ask if a LISP function is partial or total, what we
mean is "for what S-expr arguments is the algorithm which we are examining
defined?" We must hasten to add however that our leniency will %2not%*
be extended to type-faults; these are serious errors.

Even given that a computational definition is desired,
there are other plausible interpretations of conditionals.
Consider the nonsense definition:
%3g[x;y]#<=#[lic[x]#→#1;T#→#1]%*. Clearly, assuming that %3lic%* is a total predicate, 
any value computed by %3g%* will be %31%*. But requiring  left-to-right evaluation
could spend a great deal of unnecessary computation if %3lic%* is a 
%dl%*ong %di%*nvolved %dc%*alculation.
Questions of evaluation are non-trivial. You should at least be aware that some
decisions have been made and others were possible.

A word to the wise.  It doesn't seem like you can do much damage with such
a limited collection of operations as those proposed in LISP. 
Things are not quite as trivial as they might seem.
In elementary number
theory all you have is zero and some simple functions, and elementary number theory
is far from elementary.
Manipulation of our primitives, with composition, and conditional expressions, 
coupled with techniques for definition can %2also%* become complicated.

Let's apply the LISP constructs which we now have, and   define
a  new LISP function.
For example: our predicate %3eq%* is defined only for atomic arguments.  We would like
to be able to test for equality of arbitrary S-exprs.  
What should this more complex equality mean?
By equality we mean: as trees, the S-exprs have the same branching structure; and
the corresponding terminal nodes are labeled by the same atoms.
Thus, we would
like to define a predicate, ⊗→%3equal%*↔←, such that:
.GROUP SKIP 1;
.GROUP
.BEGIN CENTERIT;
%3
←equal [(A . B);(A . B)] = %et%* = equal [A;A]
←equal [(A . B);(B . A)] = %ef%*
←equal [(A . (B . C));(A . (B . C))] = %et%*

%1
.END
.APART

Here's an intuitive description of such a predicate named %3equal%*.

.BEGIN INDENT 0,4;
1.  If both arguments are atomic then see what %3eq%* says about them
(are they "%3eq%*").  We can test if they are both atomic by using %3atom%*
and a conditional expression.
.END

2.  If one is atomic and the other is not they can't be equal S-exprs.

.BEGIN INDENT 0,5;
3.  Otherwise both are non-atomic S-exprs.  Both have two sub-expressions.
Look at both first subexpressions.  If the first sub-expressions are not
equal then certainly the initial expressions cannot hope to be equal.  
If, however, the first subexpressions are equal then the question of
whether or not the initial expressions are equal depends on the
equality (or non-equality) of the second subexpressions.  Thus the
following definition:
.END
%3
.GROUP SKIP 1;
.BEGIN TABIT2(19,30);GROUP;
.P1:
     equal[x;y] <=\[atom[x] → [atom[y] → eq [x;y];
\\ %et%* → %ef%*];
\ atom[y] → %ef%*;
\ equal [car[x];car[y]] → equal[cdr[x];cdr[y]];
\ %et%* → %ef%*]

.END
%1
Notice that we use nested conditional expressions in %3equal%*; e%41%*
is itself a conditional. Also we have used predicates in the e%4i%* positions
at e%43%* and e%411%*; this is perfectly reasonable since %3equal%* is a predicate
and thus returns either %ef%* or %et%*. 

Let's show that %3equal%* does perform correctly for a specific example.  This will
also show a complicated evaluation of a conditional expression.

.BEGIN TABIT2(25,44);GROUP;SELECT 3;

equal[(A . B);(A . C)] =\[atom[(A . B)] → [atom[(A . C)] → eq [(A . B);(A . C)];
\\ %et%* → %ef%*];
\ atom[(A . C)] → %ef%*;
\ equal [car[(A . B)];car[(A . C)]] → equal[cdr[(A . B)];cdr[(A . C)]];
\ %et%* → %ef%*]
.APART
.BEGIN FILL;SELECT 1;
Now using the meaning of conditionals ({yon(P40)}),
we find that p%41%* (i.e.,#%3atom[(A#.#B)]%*#)
and p%42%* (#%3atom[(A#.#C)]%*#) when evaluated (in order) give %ef%*.
We must now evaluate p%43%*: %3equal[car[(A#.#B)];car[(A#.#C)]]%*.
This reduces to %3equal[A;A]%*, and:
.END

.GROUP;
   equal[A;A] =\[atom[A] → [atom[A] → eq[A;A];
\\ %et%* → %ef%*];
\ atom[A] → %ef%*;
\ equal [car[A];car[A]] → equal[cdr[A];cdr[A]];
\ %et%* → %ef%*]
.APART
.BEGIN FILL;SELECT 1;
This conditional expression will finally evaluate to %et%*. So p%43%* in the
original call of %3equal[(A#.#B);(A#.#C)]%* is true, and we are faced with the evaluation of e%43%*
which is %3equal[cdr[(A#.#B);cdr(A#.#C)]]%*. After evaluation of the arguments and 
evaluation of the conditional expression defining %3equal%* we will finally
return value %ef%*. That is, %3(A . B)%* and %3(A . C)%* are %2not%* equal.
Clearly evaluation of LISP expressions in this great detail is not a process
which we wish to do very often. It might perhaps be clear that such a process
is a likely candidate for execution by a machine.

Notice that throughout this example expressions like %3atom[(A#.#B)]%*
or %3eq[(A#.#B);(A#.#C)]%* were appearing but were never evaluated because
of the way in which we defined evaluation of conditionals.
.END
.END
Finally, to include conditional expressions in our syntax of LISP
expressions, we should add:

.BEGIN TABIT1(11);

<form>\::= [<form> → <form>; ... <form> → <form>]         (see {yon(P66)})

.END

As usual these syntax equations fail to capture all of our intended
meaning. The <form>s appearing in the p%4i%*-position are to be forms
taking values in %aTr%*, the truth domain.

.REQUIRE "PROB2.PUB" SOURCE_FILE;
.REQUIRE "PROB3.PUB" SOURCE_FILE;

.SS(Sequences: abstract data structures,,P100:)

.BEGIN TURN ON "#";
In several  areas of mathematics it is convenient  to deal
with  sequences  of information. That is, the problem domain  more naturally 
described as collections of numbers rather than individual numbers.
This may either simplfy understanding of the problem or 
simplify the formulation of the functions defined on the domain.
Thus several
common programming  languages  include  arrays as 
representations of  these mathematical ideas.  
First we should notice that sequences, or their representation as arrays
are data structures. We will have to describe constructors, selectors, and
recognizers for them. 
Also we will explore applications of sequences as data structures suitable for
representing many non-trivial problems in computer science.

After a certain familarity is gained in the application of algorithms which
manipulate sequences, we will discuss the problems of representation and 
implementation of this data structure. We will first give an implementation
of sequences in terms of S-expressions. That is we will describe an
%9r%*-mapping giving a representation of sequences and their primitive operations
in terms of LISP's S-exprs and primitive functions. Later in {yonss(P137)}
we will discuss low-level implementation of this data structure in terms of
conventional machines.

But first we will study sequences as abstract  data
structures:  what are  their  essential structural  characteristics?  
what properties should  be present in a programming language to allow
a natural  and flexible  representation?   This discussion  will shed
light on the important problems of representation and abstraction. 

A sequence is  an ordered set of elements⊗↓For an alternative description
on sequences and a discussion of a different view of data structures see
{yon(P147)}.←.
For example, %2(x%41%*, x%42%*, x%43%*)%1, is standard notation for 
a sequence of  the three elements, %2x%41%*, x%42%1, and %2x%43%1.
The length of a sequence is defined to be the number of elements in that sequence.
We will allow sequences to have
sub-sequences to an arbitrary finite depth.
That is, the elements of a sequence will either be individuals or may themselves be
sequences. 
For example a sequence of length %3n%*, each of whose elements are sequences of
length %3m%* is a matrix.
Here are BNF equations for sequences and their elements:

.P114:
.BEGIN TABIT1(15);
.GROUP;

<seq>\::= %2(%* <seq elem>, ...,<seq elem> %2)%* ⊗↓For the meaning of ellipses see
{yon(P130)}.←
<seq elem>\::= <indiv> | <seq>
<indiv>\:: = <literal indiv>|<numeral>| -<numeral>
<literal indiv>\:: = <indiv letter>|<literal indiv><indiv letter>|<literal indiv><digit>
<numeral>\:: = <digit>|<numeral><digit>
<indiv letter>\:: =%2 A |B |C ...| Z%*
<digit>\:: = %20 |1 |2 ... |9%*
.APART;
.END;

Notice that the structure of <indiv> is the same as that for LISP's <atom>;
the only difference is in the fonts used for letters and digits. We have
made the distinction between LISP atoms and sequence individuals intentionally.
Thus %2(A,#(B,#C),#D,#(E,#B))%* is a sequence of length four, whose second
and fourth elements are also sequences.  
We will use "%2()%*" as notation for the empty sequence.

We  want to write LISP-like functions operating  over sequences, so we will
at least  
need to give constructors, selectors and predicates for sequences⊗↓Ultimately 
we will want to give a representation for sequences as S-expressions, and
give representations for the sequence operations 
as operations on S-expressions.←. As in the case of Symbolic expressions,
we will include an undefined element; the element will be named %9B%4Seq%1 and
the full domain of sequences will be named  
.BEGIN CENTER; TURN OFF "{","}";
%ASeq%1 = %d<seq>%*∪{%9B%4Seq%1}
.END

What are the essential characteristics of a sequence? First, a sequence
is either empty or it has elements. Thus we will want a predicate to test for emptyness.
Next, if the sequence is non-empty, we should be able to select elements. Finally,
given some elements, we should be able to build a new sequence from them.

The basic predicate, which tests for emptyness, is called ⊗→%3null%*↔←.
Predicates on sequences are like predicates on S-expressions, only here 
we map sequences to truth values in %aTr%*⊗↓The 
reason for restructuring LISP predicates should now be apparent
to previous users of LISP: if we mapped the truth values to
the atoms %3T%* and %3NIL%* as is typically done, then we'd have to
map truth values of sequence-predicates to representation as sequences. Indeed
we would have to perpetuate the fraud for every new abstract data structure
domain that we wanted to introduce. Thus we did it right the first time.←.

.BEGIN CENTERIT;
.BEGIN TABIT2(10,20);

\%3null[x]%1 is \%et%* if %3x%* is the empty sequence, %2()%*.
\\%ef%* if %3x%* is a non-empty sequence.
.END

.BEGIN FILL
Thus %3null%* is  defined only for sequences.
Since we intend to operate on domains which contain data structures other than
sequences, we will need a ⊗→recognizer↔← to be sure that %3null%* is
not applied to arguments which are %2not%* sequences.
We will name this recognizer ⊗→%3isseq%*↔←.
.END

←%3isseq[%2(A, B, C)%*] = %et%*
←%3isseq[%2A%*] = %ef%1
←%3isseq[%3A%*] = %ef%1  %3isseq[%et%*] = %ef%1
←%3isseq[%2()%*] = %et%1

.BEGIN FILL;TURN ON "#";
Thus %3isseq%* is a total predicate over our intended domain, whereas
%3null%* is only partial.

While on the subject of predicates, there are a couple more we shall need.
The first one is a recognizer, %3isindiv%* which will give value %et%*
if its argument is an individual, give %ef%* if its argument is a sequence,
and will give %9B%4Seq%1 otherwise.

The second predicate is the extension of the equality relation to the class
of sequence individuals. We shall use the same name,  %3eq%* as 
we did for the S-expression  predicate.
Indeed, whenever we define a new abstract data type we will assume that
an appropriate version of %3eq%* is available for the elements of the base
domain. One of our first tasks will be to extend that equality relation to the whole
domain. We will do so for sequences later in this section. 
Equality is 
a basic relation in mathematics so it is not suprising to see it play an
important role here.
%3eq%* is one
of the few relations which we shall define across all domains. 
Functions or predicates like %3eq, isseq,%* and %3issexpr%*, which are 
applicable on several domains are called %2⊗→polymorphic functions↔←%*.


Next, the selectors for a (non-empty) sequence include: %3first%*, %3second%*,...#,etc,
where:
.END

←%3first[%2(A, B, C)%*] = %2A%*
←%3second[%2(A, B, C)%*] = %2B%*
←%3third[%2(A, B)%*] = %9B%1

.BEGIN FILL
It is also convenient to define an "all-but-first" selector, called ⊗→%3rest%*↔←.
.END

←%3rest[%2(A, B, C)%*] = %2(B, C)%*
←%3rest[%2(B, C)%*] = %2(C)%*
←%3rest[%2(C)%*] = %2()%*
←%3rest[%2C%*] = %9B%1

.BEGIN FILL
and, in conjunction with %3rest%*, we shall utilize a constructor, ⊗→%3concat%*↔←,
which is to  add a single element to the front of a sequence.
.END
←%3concat[%2A%*;%2(B,C)%*] = %2(A, B, C)%*
←%3concat[%2A%*;%2()%*] = %2(A)%*
←%3concat[%2(A)%*;%2(B,C)%*] = %2((A), B, C)%*
←%3concat[%2(B,C)%*;%2A%*] = %9B%1

.END
.SELECT 1;
The final constructor is called ⊗→%3seq%1↔←; it takes an arbitrary number of sequence
elements as arguments and returns a sequence consisting of those elements (in the
obvious order).
Let %9α%41%1,#...,#%9α%4n%1 be elements of <seq elem>, then:

.BEGIN CENTERIT;SELECT 3;

←seq[%9α%41%3; %9α%42%3; ...; %9α%4n%3] = %2(%9α%41%2, ..., %9α%4n%2)%1
.END

One question may have come to mind: how do we know when we have a sufficient set
of functions for the manipulation of an abstract data structure?
How do we know we haven't left some crucial functions out? If we have enough,
how do we know that we haven't included too many? 
Actually, this second case isn't disasterous, but when implementing the functions
it would be  nice to minimize the number of primitives we have to program.
Both of these problems are worthy of study and are the concern of anyone
interested in the design of programming languages. We will say a bit more
about solutions to these questions beginning on {yon(P116)}.

Notice that we have been describing the sequence-functions 
without regard to any underlying representation as functions
which manipulate S-expressions.
We have said nothing about these sequence operations except that they
construct, test, or select.
What we are doing is considering sequences as abstract data structures,
suitable for manipulation by LISP-like programs. How sequences are represented
on a machine or indeed as S-expressions, is irrelevant. Sequences have
certain inherent structural properties and it is those properties which
we must understand %2before%* we begin thinking about representation on a machine.
In the next section  we will show how to represent sequences 
as certain S-expressions and sequence operations will be representable as
LISP operations on that representation.
.END

First let's develop some expertise in manipulating sequences.
The first example will be our promised extension of the equality relation
to sequences. We perpetuate the name %3equal%* from S-exprs, and the basic
structure of the definition will parallel that of its namesake; but
the components of the definition will involve sequence operations rather
than S-expr operations. It might be of value to compare the two predicates.
The S-expr version is to be found on {yon(P1)}.

.BEGIN TABIT2(19,30);GROUP;SELECT 3;

     equal[x;y] <=\[null[x] → [null[y] → %et%*;
\\ %et%* → %ef%*];
\ null[y] → %ef%*;
\ equal%9'%* [first[x];first[y]] → equal[rest[x];rest[y]];
\ %et%* → %ef%*]
.END

.BEGIN TABIT2(19,30);GROUP;SELECT 3;
     equal%9'%*[x;y] <=\[indiv[x] → [indiv[y] → eq[x;y];
\\ %et%* → %ef%*];
\ indiv[y] → %ef%*;
\ %et%* → equal[x;y]]
.END

Next, we will write a predicate, %3member%* of two arguments, %3x%* and %3y%*.
%3x%* is to be an individual; %3y%* is to be a sequence;
%3member%* is to return
%et%* just in the case that %3x%* is  an element of the sequence %3y%*.
What does this specification tell us?
The predicate is partial. The recursion should be on the structure of %3y%*,
since it certainly can't be on %3x%*;
and termination (with value %ef%*) should occur if %3y%* is the empty sequence. 
If %3y%* is not
empty then it has a first element, call it %3z%*; compare %3z%* with %3x%*.
If these elements are identical then  %3member%* should return %et%*; otherwise
see if %3x%* occurs in the remainder of the sequence %3y%*.

.GROUP
Notes:

%21.%*  We cannot use %3eq%* to check equality since, though %3x%* is an individual,
there is no
reason to believe that the elements of %3y%* need be.

%22.%*  Recall that we can get the first element of a sequence with 
%3first%*,
and the rest of a list with %3rest%*.
.APART

So here's %3member%*:
.BEGIN TABIT2(16,32);SELECT 3;GROUP;
\member[x;y] <=\[null[y] → %ef%*;
\\ equal[first[y];x] → %et%*;
\\ %et%* → member[x;rest[y]]]

.END

Finally here is an arithmetic example to calculate the length of a sequence,
where length is defined to be the number of elements in the sequence.

.P118:
.BEGIN CENTERIT;SELECT 3;
←length[n] <= [null[n] → 0; %et%* → plus[1;length[rest[n]]]]
.END
.SS(Lists: representations of sequences,lists)
It is all well and good to be able to write LISP-like functions describing
operations on sequences. The algorithms are clean and understandable.
However if we wish to run these programs in a LISP environment, then
we had better be prepared to represent both the data structures and the
algorithms in terms understandable to LISP⊗↓Indeed if we wish LISP to
run on a conventional machine we had better be prepared to represent
LISP's data structures and algorithms in a manner understandable
to conventional  hardware. This task is the subject of future chapters
in the book.←. This is the problem of representation. Granted that we
could have overcome the problem by explicitly representing sequences directly
as LISP S-expressions and could have written functions in LISP which
used %3car-cdr%*-chains to directly manipulate the representations.
This misuse of representation is a common fault in 
LISP programming and its practice is to be discouraged.
First the resulting programs are much more difficult to read and debug and understand.
More important, the programs are explicitly tied to a specific representation
of the abstract data structure; if at some later date it is desired to
change the representation, then many programs will have to be rewritten.
In {yonss(P41)} we develop a complex algorithm for
differentiation on a class of polynomials, moving from a unclear
and highly representation-dependent formulation, to a clear concise
representation-independent algorithm. 

Obviously we will always have to supply a representational bridge between
the abstract data structures and algorithms, and their concrete counterparts.
One aspect of this study of data structures is to understand what is
required to build this bridge and how best to represent these requirements
in a programming language.

The first decision to be made is how to represent the abstract data structure;
how  should we represent sequences as S-expressions? Indeed how should we
choose representations in general? Usually there is not just one "best" 
representation. Some obvious considerations involve the difficulty of
implementing the primitive operations (constructors, selectors,
recognizers, and predicates) on the abstract data structure. Also we must
keep in mind the kinds of algorithms which we wish to write;
computation takes time, and since this is computer science we should
give consideration to efficiency.

.GROUP
A reasonable choice for a representation of sequences as S-expressions
is the following:

.BEGIN CENTERIT;
←%9r%f(%1<indiv>%f)%1 = %9.%4<atom>%1
←and for %9α%41%1,#...,#%9α%4n%1 in %d<seq elem>%*:
.END
.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "?" FOR "\"
.TURN ON "∞" FOR "%";
.TABS 10,35;
?      ∞9r∞f(∞2 (∞9α∞41∞1, ..., ∞9α∞4n∞2) ∞f)∞1 = ∞G?         /\
?                     ?        /  ...
?                     ?   ∞9r∞f(∞2 ∞9α∞41∞2 ∞f)∞g  
?                     ?             ...
?                     ?              \
?                     ?              /\
?                     ?             /  ...
?                     ?        ∞9r∞f(∞2 ∞9α∞42∞2 ∞f)∞g  
?                     ?                 ...
?                     ?                   \
?                     ?                   /\
?                     ?                  /  \ 
?                     ?                 /    \ 
?                     ?                /     ∞3NIL∞*
?                     ?           ∞9r∞f(∞2 ∞9α∞4n∞2 ∞f) 
.END

.APART

.BEGIN TURN ON "#";
That is, the right-hand branch in this LISP-tree representation of a sequence
will always point to the rest of the sequence or will be the atom %3NIL%*.
Notice that the description of the %9r%*-mapping is recursive. Thus for example:

.GROUP;
.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "?" FOR "\"
.TURN ON "∞" FOR "%";
.TABS 15,37;
?    ∞9r∞f(∞2 ((A,B,C),(D)) ∞f)∞1 = ∞G?          /\
?                     ?         /  \
? 		      ?        /   ...
?                     ?  ∞9r∞f(∞2 (A,B,C) ∞f)∞g
?                     ?              ...
?                     ?               \
?		      ?               /\
?	 	      ?              /  \
?		      ?             /   ∞3NIL∞*
?                     ?        ∞9r∞f(∞2 (D) ∞f)∞g
.END
which will finally expand to %3((A .(B . (C . NIL))) .((D . NIL) . NIL)))%* 
since %9r%f(%2#(A,B,C)#%f)%1 is %3(A#.(B#.(C#.NIL)))%1 and
%9r%f(%2#(D)#%f)%1 is %3(D#.#NIL)%*.
.APART

You should become fluent in translating between
S-expr notation and sequence notation.
For convenience sake we will carry over the sequence notation --#%2(A,#B,#C)%*#--
to that for the representation in LISP --#%3(A,#B,#C)%*#--⊗↓
%1Be aware that
%3A%* is an atom and %2A%* is a sequence element; they are not the same
data structure.←
thinking of %3(A,#B,#C)%* as an abbreviation for %3(A#.(B#.(C#.#NIL)))%*.
.END

Next,  what about a representation for the empty sequence? Looking at the
representation of a non-empty sequence it appears natural to take ⊗→%3NIL%*↔← as
%9r%f(%2()%f)%1 since after you have removed all the elements from the sequence
%3NIL%* is all that is left in the representation.
To be 
consistent  then:
.BEGIN CENTERIT;
←%9r%f(%2#()#%f)%1 = %3NIL%*
.END
This gives us a complete specification of the %9r%*-mapping for the domain;
we have represented the abstract domain of sequences in a subset of the
domain of Symbolic Expressions. 
The S-expr representation of a sequence is called a %2⊗→list↔←%*; and we
will refer to the abbreviation, 
.BEGIN CENTERIT;GROUP;
←%3(%9α%41%3,#...,#%9α%4n%3)%1 for

←%3(%9α%41%3#.(%9α%42%3#.##...#(%9α%4n%3#.#NIL)#...)%1 as %2⊗→list-notation↔←%*. 
.END
Thus sequences are the abstract data structure; lists are their representation.
Since the atom %3NIL%* takes on special significance in list-notation
it is endowed with the special name %2⊗→list terminator↔←%*.
.GROUP
And
a notational point:  in graphical interpretation of list-notation 
it is often convenient to write:

.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "∞" FOR "%";
.turn on "?" FOR "\"
.TABS  20,45;
?⊂αααπααααα⊃         ? ⊂αααπαα⊃
?~   ~ NIL ~    ∞1as∞* ? ~   ~≤'~
?%ααα∀ααααα$         ? %ααα∀αα$

.END
.APART
%1
.GROUP
Thus, for example %3(A, (B, C), D)%* is:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "∞" FOR "%";

		⊂αααπααα⊃  ⊂αααπααα⊃  ⊂αααπαα⊃
		~ A ~ #αβα→~ # ~ #αβα→~ D ~≤'~
		%ααα∀ααα$  %αβα∀ααα$  %ααα∀αα$
			     ~
			     ~   ⊂αααπααα⊃  ⊂αααπαα⊃
			     %αα→~ B ~ #αβα→~ C ~≤'~
				 %ααα∀ααα$  %ααα∀αα$

.END
.APART
or, in "dotted-pair" notation: %3 (A .((B .(C . NIL)).(D . NIL)))%1.
Finally, in list-notation the commas can be replaced by spaces⊗↓This convention
is one of the few instances of a "good" bug. The early LISP papers required
full use of commas, but due to a programming error in the LISP output routine,
lists were printed without commas. It looked so much better that the bug
became institutionalized.←
.BEGIN CENTERIT;
%1←e.g. %3(A, (B, C), D) = (A (B C) D).

%1but beware: the "dots" in dot-notation are %2never%* optional!

←%1that is   %3(A. (B. C)) %a≠%* (A (B C)).
.END

Let's take stock of our position: We have an intuitive understanding
of what we mean by "sequence". We have described selectors,  constructors,
and a recognizers, albeit at an abstract level, for manipulating
sequences. We have  represented our notion of sequences
as a subset of the S-expressions called lists.
Clearly the final step is to represent our
sequence-manipulators as certain LISP functions.
Let %3first%4r%1 be a LISP function which will represent the sequence 
operation %3first%*⊗↓Indeed, once the %9r%*-mapping is defined
on the %2domain%* it is induced on the %2operations%*.←.
Then for example we might expect:

.BEGIN CENTERIT SELECT 3; TURN ON "#";
←%3first%4r%*[(A, B, C)] = %9r%f(%3 first %f)%3 = A%1⊗↓Notice that we have written %3(A, B, C)%* 
instead of %2(A, B, C)%*.←
.BEGIN FILL;SELECT 1;
The problem is that this line is not quite right.
LISP functions  expect their inputs to be S-expressions
but %3(A,#B,#C)%* is %2not%* an S-expression.
To be correct we %2should%* have written:
.END

←%3first%4r%*[(A .(B . (C . NIL)))] = A%*

.END
It might be argued that %3(A, B, C)%* is just a convenient abbreviation
for the ugly %3(A#.#(B#.#(C#.#NIL)))%*, but even so, if we wish the
machine to understand and use the abbreviation we must examine the
implications of the notation. 
Clearly it is more reasonable to read and write in list notation.
As long as we perform only list-operations on lists there is 
no reason to look at the underlying dotted-pair representation⊗↓Indeed,
a strong case can be made for %2never%* allowing any operations on lists
%2except%1 list operations! See the discussion of type-faults on {yon(P131)}
and {yon(P132)}.←.
However it must be remembered that list operations are carried out on the
machine using the dotted-pair representation. 
%2We%* might carry out the "list-to-dotted-pair"
 transformations implicitly, but a machine which evaluates LISP
expressions will have to have an explict transformation mechanism.

So a convenient or even necessary part of our representation of
sequences is the specification of transformations between the
abstract data structure notation and the notation of the underlying
representation.
.BEGIN TURN ON "←";
Next we can give representations for the sequence operations.
To be precise we should continue our convention of writing 
the subscript %4r%* on the LISP representation of a sequence operation; thus
%3seq%* is represented by %3seq%4r%1. In most circumstances no confusion is
likely, so we will usually omit the subscript.

In our representation, the construction
of a sequence from an arbitrary number of elements 
will be represented by a LISP function
%3seq%4r%1. We will use %3list%* interchangeably with %3seq%4r%1, since
historically %3list%* takes precedence.
.BEGIN CENTERIT;
←%9r%f(%3 seq %f)%* = list
.END

⊗→%3list↔←%3[%9α%41%3; %9α%42%3; ... ;%9α%4n%3]%1 generates 
a list consisting of the 
%9α%4i%1 arguments. That is, 
%3list%1 
is the appropriately nested composition of %3cons%*es:

←%3cons[%9α%41%3;cons[%9α%42%3; ... cons[%9α%4n%3;NIL]] ...] %1.

.BEGIN GROUP;NOFILL;
Examples:
.select 3;
←list[A;B] = (A B)
←list[A;B;C] = (A B C)
←list[cons[A;B];car[(A . B)]] = ((A . B) A)
←list[A;list[B;C]] = list[A;(B C)] = (A (B C))
←list[] = ()
←list[NIL] = (NIL)
.END

Notice that %3list%* is %2not%* strictly a function in the LISP sense; 
it %2does%* evaluate its arguments,
but it can take an arbitrary number of them. See {yon(P111)}.
For the moment, %3list%* is simply a notational abbreviation for the nested %3cons%*.
A final practical note: some people
restrict lists so that elements of a list are either atomic or are lists themselves.
Even though our BNF definitions of sequences ({yon(P114)})
conform to this restriction, in 
practice we will allow elements of a list to be arbitray S-expressions;
thus %3(A,#(A#.#B),#C)%* is a list of three elements.
But beware; %2(A,#(A#.#B),#C)%* is %2not%* a sequence.

The representation of the selector functions should be apparent from the
graphical representation. 
We leave it as an exercise for the reader to specify representations 
for these functions; however here are a few of the other representations:

.BEGIN CENTERIT;
←%9r%f(%3 isindiv %f)%* = atom

←%9r%f(%3 isseq %f)%* = islist%1  where:

←%3islist[x] <= [atom[x] →[eq[x;NIL]→%et%*;%et%*→%ef%*];%et%*→islist[cdr[x]] ]
.END

.END
.SELECT 1;
.P116:

To summarize the accomplishments of this section, we have  in
effect added a %2new%* data structure to the repertoire of LISP.
The addition process included:

.BEGIN INDENT 0,10,10;

%21.  The abstract operations.%* We give constructors, selectors, and predicates for
the recognition of instances of the data structure.

%22.  The underlying representation.%* We must show how the new data
structure can be represented in terms of existing data structures.

%23.  Abstract operations as concrete operations.%* We must write LISP
functions which faithfully mirror the intended meaning of the abstract operations
when interpreted in the underlying representation.

%24.  The input/output transformations.%* We should give conventions
for transforming to and from the internal representation.

.END
There is another view of this problem of representability of  data structures
due to J. Morris. Here we use the notion of transfer functions whose purpose
is to supply mappings between the abstract structure and its representation.
Once the transfer functions are given it is usually easy to supply appropriate
implementations of the abstract primitives. We need two transfer functions;
a write-function, %aW%*, to map the representations into the abstract objects;
and a read-function, %aR%*, to map the abstract objects to their representations.

For the problem at hand, representing sequences, we want %aR%* to map from
elements of %d<seq elem>%* to %d<sexpr>%* (see {yon(P114)} and {yon(P115)}); and
we want %aW%* to map from %d<sexpr>%* to %d<seq elem>%*.
Before we give such %aR%* and %aW%*, let's see what they will do for us.
We could define %3first%4r%1 such that:
.BEGIN CENTERIT;
←%3first%4r%*[x] = %aW%*(car[%aR%*(x)])
.END
Here we have used "(...)" for function application rather than "[...]" which
is reserved for LISP evaluation. What the equation says is that given a 
list %3x%*, we can map it to the S-expression representation using %aR%*;
the result of this map is an S-expr and therefore suitable fare for %3car%*'s
delicate palate; the result of the %3car%* operation is then mapped back into
the set of list elements⊗↓%1lists or individuals← by %aW%*.
The other operations for manipulating sequences can be described similarly.
With this introduction, here are  appropriate transfer functions:

.BEGIN TABIT1(15);SELECT 3;
.GROUP

%aW%*(e) <=\[isnil[e] → mknull[];
\ atom[e] → mkindiv[e];
\ %et%* → concat[%aW%*(car[e]);%aW%*(cdr[e])]
\ ]
.END
.BEGIN TABIT1(15);SELECT 3;
.GROUP

%aR%*(l) <=\[null[l] → NIL;
\ isindiv[l] → atomize[l];
\ %et%* → cons[%aR%*(first[l]);%aR%*(rest[l])] 
\ ]

.END
We have seen all of the functions and predicates involved in %aR%* and %aW%*
except the two %3atomize%* and %3mkindiv%*. If you think about the implementation
of these two functions you would see that they are the identity functions
for all practical purposes. However they %2are%* only because of the
representations that we happened to pick; they need not be so simple.
A more careful  inspection would show that %3mkindiv%* expects as input an atomic
S-expression and outputs a sequence individual; %3atomize%* acts
conversely. If the representations of the atomic S-expressions were different
from the representations of sequence individuals, then we would have some
work to do.

The scheme of transfer functions is a more mathematical approach to  that
outlined above in %21.%*-%24.%*. We find %21.%*-%24.%* preferable since
it gives a description more in line with what we should expect to implement
in a programming language.

To finally review what has transpired since it is a model of what is to come:
we developed a new  abstract data structure called sequences, discussed
notational conventions for writing sequences; described
operations and pertinent control devices for writing algorithms;
finally we showed that
it was possible to represent sequences in the previously developed
domain of S-exprs. Thus if we had a machine which could execute
S-expr algorithms we could encapsulate that machine within the %9r%*-mapping
such that we could write in list-notation and have it translated internally to
S-expr form; we could write list-algorithms and have them execute correctly
using the %9r%*-maps of the sequence primitives; and finally it would
produce list-output rather thant the internal S-expr form. To all intents
and purposes our augmented LISP  machine understands sequences.
Indeed, this is the way most LISP implementations are organized; input 
may either be in S-expr form or list-notation; internally all data structures
are stored as S-exprs; all algorithms operate on the S-expr form; and finally,
S-exprs which can be interpreted as lists are output in list-notation.

We will approach the other abstract data structure problems in a similar manner,
first developing the data structures independent on their representation, and
later showing how to  represent this new domain in terms of some previously 
understood domain. We will see in {yonss(P105)} that much of the
mapping from input through output can be specified in a natural style
and LISP can automatically generate the necessary input and output
programs.
.CENT(Problems)

I Discuss %3cons[%9α%41%3;cons[%9α%42%3;%9α%43%3]]%1 
as a representation for %3(%9α%41%3, %9α%42%3, %9α%43%3)%1.

.REQUIRE "PROB4.PUB" SOURCE_FILE;
.SS(A respite)
.SELECT 1;
.BEGIN TURN ON "←";
This section contains some hints and notes on the introductory material of
this chapter. First a reiteration of a previous admonition: though most of
this material seems quite straightforward, the next chapter will begin
to show you that things are not all that trivial.  LISP is quite powerful.
The preceding material %2is%* basic and the sooner it becomes second nature
to you the better.  

A second admonition: besides learning about the basic constructs of the language,
the previous material should begin to convince you of the necessity for
precise specification of programming languages. 
In particular we have seen that the process of evaluation of expressions
must be spelled out quite carefully. Different evaluation schemes lead to
quite different programs. Since evaluation %2is%* the business of programming
languages  we'd better do all we can to make a precise specification.

And a final warning: 
a major point of this whole book is to stress the proper respect for abstraction
as a tool for controlling complexity in programming, and as a means of
writing implementation (representation) independent programs. As we begin
writing more complex algorithms, the power of abstraction will become more
apparent, but the lessons we learned in representing
sequences contain the essential ideas of abstraction and representation.
Do not forget them.

We have now seen two examples of abstract data structures. First, the study
of Symbolic Expressions was begun without any regard for their implementation.
They were deemed to be objects of sufficient interest in their own right⊗↓But
if there is some nagging doubt about the problems of implementation,
relax; we'll see plenty of this kind of thing later.←. The basic components
of our study were the data structures, themselves; the operations on the
data structures (%3car, cdr, cons, eq%* and %3atom%*), 
and finally the %2⊗→control structures↔←%* needed by the algorithms
which operated on the data structures. The control structures for
our LISP algorithms  are the conditional expression and recursion. They
are called control structures since they are used to direct the flow
of the algorithm as it executes.
Indeed these three components: data, operations, and control are the
main ingredients of any programming language.
Most languages have a superficially richer class of control devices;
"while"-statements and "DO"-loops are examples. Most control structures
are explicit language constructs, whereas recursion is implicit, with
few languages requiring some kind of declaration to the effect that a
procedure is recursive.

As we introduce each new abstract data structure we add new operations tailored
to its needs. In adding sequences we introduced %3first, rest, null,#...%*.
We should also consider the question of introducing new control structures.
Again, with sequences we stayed with recursion, though perhaps
a simpler control structure which went down the sequence, selecting elements
might be more in keeping with the data structure.
There is a
natural relationship between data structure and control structure;
sometimes we can exploit it to good measure. Looking ahead, the
iterator %3lit%* on {yon(P136)} is an example of a control regime
applicable to sequences.
When we consider abstract data structures in future chapters we will again 
see the three components: data, operations, and control. 

The new feature which we considered in discussing sequences was the problem
of representation. We showed how to represent sequences in terms of
S-expressions. We will continue this pyramiding of data structures in the future;
we will consider our work done as soon as we have a representation
of our new data structure in terms of an existing one. Finally
the suspense will become too great and we will exhibit a representation
of the underlying layer of S-expressions. Even later we will discuss
efficient representation of data structures, independent of their possible
S-expression representation. Let's face it; there are lots of data
structures in the world which are not best represented as S-expressions.
a further consideration appears because of the representational issue;
even though we have represented a particular data structure as a 
complex S-expression we have %2no%* business operating on that representation
with S-expression functions. Please don't use %3car%* and %3cdr%*  on 
lists even though you know what the representation is.
You  might have noticed
that in our representation of lists we can find the n%8th%* 
element in a list  by using %3cad%8n-1%*r%1. And you know
%3cadr%* is the second element, %3cdr%* is the %3rest%* of the list.
But please keep it a secret. These representation-dependent coding
tricks⊗↓called "puns" by C. Strachey← are dangerous. They are
really type-faults as discussed on {yon(P131)} and {yon(P132)}.

Well, what does all this have to do with a course in data structures?
We will show quite soon that we can exploit abstraction as a means for
giving a clear specification of evaluation of LISP expressions, and
the  representational techniques we will use will involve applications
of abstract data structures. A more tangible benefit perhaps should be
an increased awareness of the structure and behavior of programming languages,
and hopefully the beginnings of a better style of programming.

.P147:
Another part of our investigation should be to answer the question
"What is a data structure?".
As we mentioned at the beginning of {yonss(P100)}  there is a different 
characterization of sequences which will give a different interpretation
of data structures. The standard  mathematical definition of
a sequence is as a function from the integers to a particular domain.
Thus a finite sequence %ds%* might be given as:
.BEGIN TURN OFF "{","}";CENTERIT;
←%ds%1 = %*{<1, s%41%*>, <2, s%42%*>, ...<n, s%4n%*>}
.END
And to select components of %ds%*, we then use ordinary function application:
%ds(i)%1#=#%*s%4i%1. Indeed, if you have programmed in a language which 
has array constructs, you will recognize "application" as 
the style of notation used.
However this is quite different form what we did in the section on sequences.
For example, if %2(A,#B,#C)%* is a sequence, %ds%*, then in the new interpretation
we should write:
.BEGIN TURN OFF "{","}";CENTERIT;
←%ds%1 = %*{<1, %2A%*>, <2, %2B%*>,<3, %2C%*>}
.END
Thus %ds(2)%1 is %2A%*, etc. What has happened is that what was previously
considered to be a data structure has become a function, and the selector
functions on the data structure have now become static indices on the
function.
Or to make things more transparent:
.BEGIN TURN OFF "{","}";CENTERIT;
←%ds%1 = %*{<%3first%*, %2A%*>, <%3second%*, %2B%*>,<%3third%*, %2C%*>}
.END
Then we  would write %ds(%3first%*)%1 rather than %3first%d(s)%1.
This idea can easily be applied to S-exprs and their functions. 
In graphical terms
we are representing the structures such that the arcs of the graph
are labeled with the selector indices; with L-trees 
the labeling was implicit: left-branch was %3car%*; right-branch was %3cdr%*.
With explicit labels on the branches, the trees become unordered.
Several languages implement such unordered trees; they are called
%3structure%*s in Algol68 and EL1, and called %3record%*s in Pascal.
Several formalism exploit this view of data structures, in 
particular the Vienna Definition Language which is a direct descendant of LISP
represents its data in such a manner.
What then is a  data structure. It depends on how you look at it. For our
immediate purposes we will try to remain intuitive and informal.
We will try to characterize an abstract data structure as a domain and
a collection of associated operations and  control structures. The operations and
control mechanisms should allow us to describe algorithms in a natural manner
but should, if at all possible, remain representation independent.

Now for some more mundane tips and tricks on LISP programming:
When evaluating or writing functions or (predicates) %2always%* keep in mind
any  restrictions of the function:  is it partial? is it total? defined only
for lists? are there restrictions on arguments? When taking %3car%* or %3cdr%* is the
argument non-atomic?

A few tricks were embedded in the problem sets.
Recall problem %28%* on {yon(P11)}. The composition %3atom[cons[ ...]]%*
will always evaluate to %ef%* ⊗↓%1If it has a value at all!
If the computation of the arguments to the %3cons%* does not
terminate (recall the %aCBV%* scheme on {yon(P121)}) then we obviously won't get %ef%*.←
since the result of %3cons%*-ing is always
non-atomic.
In %210.%*, we used atoms with  the same
letter strings as predicate names, %3ATOM%* and %3EQ%*. %3ATOM%* and %3EQ%*
are perfectly good atoms, and are %2not%* the LISP predicates.
%214.%* shows that predicates are perfectly good expressions to evaluate; e%41%*
is a predicate. Similarly,  %216.%*  shows that some conditional expressions may
appear within a functional composition.

.BEGIN TURN ON "#";
Notice that %3twist%* in II is total whereas %3findem%* is partial.
%3findem%* is partial since %3y%* must be atomic. Both functions build 
new trees, %3twistem%* reverses left- and right-branches recursively;
%3findem%* builds a tree with the same branching structure as %3x%*,
but the terminal nodes contain %3T%* at the points where the atom %3y%*
appears in the original tree, and %3NIL%* otherwise.
.P24:
.END

Now for a representational trick:
on {yon(P12)} you should have discovered that the value of:

←%3cons[%9α%41%3,(%9α%42%3, ...%9α%4n%3)]%1 is: %3(%9α%41%3, %9α%42%3, ... %9α%4n%3).

%1Notice that %3list[%9α%41%3;(%9α%42%3, ... %9α%4n%3)]%1 is 
%3(%9α%41%3 (%9α%42%3 ... %9α%4n%3))%1.
So %3cons%* will add a new element to the front of an existing list. %3list%*
will create a new list whose elements will be the values of the arguments
to %3list%*.

Be clear on the difference between the representation of the empty list, %3NIL%*, and 
the list consisting of %3NIL%*, %3(NIL)%*; %3(NIL)%* is 
an abbreviation for %3(NIL . NIL)%*, which certainly is not %3NIL%*.
List-notation is  an abbreviation and can always be translated back
into a S-expr.

And an admonition:
though our representation of sequences is such that %3first%*, %3rest%*
and %3concat%*
are identical to %3car%*, %3cdr%*, and %3cons%* respectively, we should
use the names %3first%*, %3rest%*, and %3concat%* 
to make it clear that we are operating on lists.

.END